Abstract

Data compression will play an increasingly important role in the storage and transmission
of image data within the NASA science programs as the Earth Observing System comes into
operation. It is important that the science data be preserved at the fidelity the instrument
and satellite communication systems were designed to produce. Lossless compression must
therefore be applied, at least to archive the processed instrument data. In this paper we
present an analysis of the performance of lossless compression techniques and develop an
adaptive approach which applies image remapping, feature-based image segmentation to
determine regions of similar entropy, and high-order arithmetic coding to obtain significant
improvements over the use of conventional compression techniques alone. Image remapping is
used to transform the original image into a lower entropy state. Several techniques were
tested on satellite images, including differential pulse code modulation, bi-linear
interpolation, and block-based linear predictive coding. The results of these experiments
are discussed, and trade-offs between computation requirements and entropy reductions are
used to identify the optimum approach for a variety of satellite images. Further entropy
reduction can be achieved by segmenting the image based on local entropy properties and then
applying a coding technique which maximizes compression for each region. Experimental
results are presented showing the effect of different coding techniques for regions of
different entropy. A rule base is developed through which the technique giving the best
compression is selected. The paper concludes that maximum compression can be achieved
cost-effectively and at acceptable performance rates with a combination of techniques which
are selected based on image contextual information.
1 Introduction


With the steadily growing use of imaging technology in almost all computerized scientific
fields, the need for better image compression to minimize transmission time and storage
space and cost is ever present. Lossless image compression, which permits faithful
reconstruction of the original image, is important in maintaining accurate image archives of
digitized documents, in remote sensing image storage and retrieval, and in medical imaging,
where loss of fidelity due to lossy compression can compromise radiological diagnosis. The
techniques for lossless compression basically attempt to re-code image data such that
redundant elements are coded with the least number of bits possible, using the frequency of
occurrence of elements to determine the number of bits used to code them [1] [2]. The
amount of compression obtained is related to the degree of redundancy present in the image.
To obtain higher compression than the basic approach, preprocessing techniques which attempt
to decorrelate image pixel values [3] and source modeling techniques which attempt to use
the context of local pixel values [4] have been tried. One- and two-dimensional differential
pulse code modulation (DPCM) [5], bi-linear interpolation [6], and hierarchical
interpolation [7] techniques are typical of the decorrelation approaches that have been
applied to improve image compression. Prior to coding an image, a statistical model of the
image is needed, which can be developed in a separate pass over the image or adaptively as
the image pixels are being coded [8]. Zero-order models consider each pixel to be
independent of its neighbors. Higher-order models collect statistics on sequences of
adjacent pixels [9]. Higher-order statistical models provide better compression where images
are smooth and regular, and zero-order models are more effective where the images have a
large high-frequency content [10]. Investigators have shown that combinations of these
techniques, when applied to particular classes of images, provide improved compression, but
no one combination is suitable for all images, particularly satellite images where the image
texture can change dramatically over small spatial distances, from smooth desert regions to
rough snow-capped mountains.

Based on work done on lossless image compression for medical images by the authors [11] and
others [12], it is believed that satellite imagery would best be compressed with multiple
compression techniques which are adaptively selected based on properties of local image
regions. Image regions which contain a large degree of prominent fine texture will not
compress well because little data redundancy and inter-pixel correlation will exist. For
such regions, the simplest, least computationally expensive compression technique should be
applied. For regions with smooth textures or regular patterns, image decorrelation and
high-order modeling will produce the highest compression. For regions with characteristics
between these two extremes, the most compression will be achieved with some combination of
decorrelation and statistical modeling, with the order of the statistics selected based on
some measure of inter-pixel pattern repetition. The problem then becomes the identification
of image features which can be used to select the decorrelation, modeling, and encoding
techniques that give maximum compression at minimum "cost", cost being a measure of the
computational requirements of a given approach against the degree of improvement in
compression achieved by using it. If such a feature set can be found, then the image to be
compressed could be divided into arbitrarily small regions, the features calculated for each
region, the regions classified according to the best compression techniques to apply, and
then similarly classified regions compressed appropriately (Figure 1).

A pipelined process of image analysis, feature extraction, classification, and
decorrelation, followed by modeling and coding, is shown in Figure 1. Parallelization is
possible due to the region processing approach used. Feature extraction and region
classification are performed in parallel on each image region. The classifier determines the
correct compression approach to use with a decision tree operating on the calculated
features. Image regions that will use a decorrelation preprocessor will be passed through
that path; other regions will be processed by the modeler and coder appropriate to the type
of region. The modeler and coder are serialized processes in this architecture because
adaptive modeling will be used, that is, the statistical model is built as the image pixels
are being compressed. All image regions that are similarly classified will be processed
through the same modeler/coder path so that a unique statistical model is generated for each
type of region. The modeler and coder will process all pixels in a region, then proceed to
the next region, and so on. This procedure preserves any two-dimensional correlation of
pixel values and should maximize compression. To be able to decompress the image file, it is
necessary that the compressed file output by this architecture contain a header which
records a classifier identifier for each region. The image can be reconstructed by reading
the region classifier identifier and decompressing the region through the reverse of the
compression process.
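
The region-by-region flow described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the authors' implementation: the names
`split_regions` and `compress_image`, the use of plain callables for the classifier and the
per-class modeler/coder paths, and the returned data layout are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Region = List[List[int]]  # one square block of pixel values


def split_regions(image: Sequence[Sequence[int]], size: int) -> List[Region]:
    """Cut an image into non-overlapping size x size regions in row-major order."""
    regions: List[Region] = []
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            regions.append([list(row[c:c + size]) for row in image[r:r + size]])
    return regions


def compress_image(
    image: Sequence[Sequence[int]],
    size: int,
    classify: Callable[[Region], int],                    # hypothetical classifier
    coders: Dict[int, Callable[[List[Region]], bytes]],   # hypothetical coder paths
) -> Tuple[List[int], Dict[int, bytes]]:
    """Classify every region, then send each class through its own coder path."""
    regions = split_regions(image, size)
    # One class identifier per region; this is the information the header records.
    class_map = [classify(region) for region in regions]
    streams: Dict[int, bytes] = {}
    for class_id in sorted(set(class_map)):
        # All regions of one class share a single adaptive model/coder path.
        members = [reg for reg, cid in zip(regions, class_map) if cid == class_id]
        streams[class_id] = coders[class_id](members)
    return class_map, streams
```

Classification of the individual regions is independent and could run in parallel, whereas
each coder callable sees its regions one after another, mirroring the serialized
modeler/coder stage.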

The application of a decorrelation step prior to modeling and coding was also considered,
after the suitability of the approach had been established without it. Decorrelating the
image reduces pixel value variance [10] and therefore improves compression. Two
decorrelation methods were used, DPCM and bi-linear interpolation. Both are relatively
simple to implement: DPCM uses one or more previous pixel values to predict the current
pixel value, while bi-linear interpolation uses the lattice points in a 2-dimensional kernel
of pixels to predict the other kernel pixel values. In the case of bi-linear interpolation,
the output is two data streams: a set of prediction errors, and the values of the lattice
points for each non-overlapping 2-dimensional kernel in the image. The computational cost of
bi-linear interpolation is considerably higher than that of DPCM, but the improvement in
compression can be considerable for certain types of images.
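
As an illustration of the simpler of the two methods, the sketch below shows a one-dimensional
DPCM that predicts each pixel from the single previous pixel in a row. It is a minimal
example under assumptions of our own: the function names are hypothetical, and the residuals
are stored modulo 2^16 so that they fit the same 16-bit word as the pixels, which the text
does not specify.

```python
def dpcm_encode(row, bits=16):
    """Previous-pixel DPCM: keep the first pixel, then store successive differences."""
    if not row:
        return []
    mod = 1 << bits  # assumed wrap-around so residuals stay within `bits` bits
    residuals = [row[0]]
    for prev, cur in zip(row, row[1:]):
        residuals.append((cur - prev) % mod)
    return residuals


def dpcm_decode(residuals, bits=16):
    """Invert dpcm_encode by accumulating the residuals modulo 2**bits."""
    if not residuals:
        return []
    mod = 1 << bits
    row = [residuals[0]]
    for diff in residuals[1:]:
        row.append((row[-1] + diff) % mod)
    return row
```

On a smooth row the residuals cluster around zero (or just below 2^16 for small negative
steps), which is the reduction in variance that the subsequent modeler and coder exploit.
Because the decoder reproduces the same predictions, the step is exactly reversible.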

2 Image Analysis and Feature Extraction 

The key to the success of this adaptive compression scheme lies in the selection of features
which can usefully classify image regions according to the optimum compression approach and
thereby minimize the compressed file size. There exists no theoretical foundation for
determining the feature set that would provide optimum compression selection. An empirical
approach was therefore taken to determine the best feature set. Satellite images containing
a large variety of textures were selected for compression. The images were divided into
symmetrical regions, and each region was compressed using several decorrelation and lossless
compression techniques. Several features were calculated for each region. Using cluster
analysis techniques to identify unique regions in feature space, a binary decision tree
classifier was developed using features from a large variety of regions. The features
selected for the initial training of the classifier were determined by evaluation of the
compression process and from previous work on medical image compression [11], [13].

Figure 1: Architecture of the image compression approach.

There are several well-known techniques for estimating the amount and degree of texture in
an image, including co-occurrence matrices [14], sum and difference matrices [15], and local
grey-scale dependence matrices [16]. Due to the memory requirements of these approaches (at
least the same amount as the original image), we elected to use a less resource-demanding
feature set. The compressibility of an image is related to the range and distribution of
intensities in the image, the individual pixel entropy, and the degree of variability in
local pixel intensity values. Based on this knowledge, the initial feature set included:

Average pixel intensity - The mean of the pixel values over the region, selected so that
regions of similar brightness would be grouped together:

$$\bar{x} = \frac{\sum_{i,j} p(i,j)}{N}$$

where i, j are the rows and columns, respectively, in the n x n subimage, N = n x n, and
p(i,j) is the intensity of pixel i,j.

Pixel Intensity Variance - The variability in brightness over the region:

$$P_v = \frac{\sum_{i,j} \left(p(i,j) - \bar{x}\right)^2}{N}$$

First-Order Entropy - The amount of pixel value redundancy present in the region:

$$H = \sum_{i,j} \left(-p_{i,j} \log_2 p_{i,j}\right)$$
Average Run-length - The average value of the run-length sections generated by thresholding
the subimage with the mean gray-level:

$$R_a = \frac{\sum_{i,j} R(i,j)}{M}$$

where i, j are the rows and columns in an m x n subimage run-length matrix, M = m x n,
and R is the distance of the run length.

Run-length Variance - The variance in run lengths obtained by thresholding the subimage
with the mean gray-level:

$$R_v = \frac{\sum_{i,j} \left(R(i,j) - R_a\right)^2}{M}$$
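
The five features listed above might be computed for one subimage along the following lines.
This is a hedged sketch rather than the authors' code: the function name `region_features`
and the dictionary keys are hypothetical, the entropy is taken over the gray-level histogram
of the region (the usual first-order estimate), runs are measured along rows only, and the
run statistics are normalized by the number of runs, which may differ from the M used in the
text.

```python
import math
from collections import Counter


def region_features(region):
    """Return mean, variance, first-order entropy and run-length statistics."""
    pixels = [p for row in region for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n

    # First-order entropy from the relative frequency of each gray level.
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    # Threshold at the mean gray level and collect run lengths along each row
    # (row-wise runs are an assumption; the text does not fix a direction).
    runs = []
    for row in region:
        run = 1
        for prev, cur in zip(row, row[1:]):
            if (cur > mean) == (prev > mean):
                run += 1
            else:
                runs.append(run)
                run = 1
        runs.append(run)
    run_mean = sum(runs) / len(runs)
    run_var = sum((r - run_mean) ** 2 for r in runs) / len(runs)

    return {
        "mean": mean,
        "variance": variance,
        "entropy": entropy,
        "run_mean": run_mean,
        "run_variance": run_var,
    }
```

All five values need only a single pass over the pixels plus a small histogram, which is in
keeping with the stated aim of avoiding the memory cost of matrix-based texture measures.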

The images analyzed in this process were all of sea ice, taken from spectral band 3 of
AVHRR satellite data. As shown in Figure 2, sea-ice images (128x128x16-bit) in this spectral
band provide a wide variety of textures, from smooth to very rough. The compressibility of
regions within this large image varies considerably, with the left, middle, and right images
in the figure compressing to 35%, 39%, and 28%, respectively, when arithmetic coding is used
with first-order modeling. As would be expected, the less textured images compress better.


3 Decision Tree Classifier 

An unsupervised classification method was used to form clusters in feature space, and
cluster analysis [17] was used to allocate class regions by minimizing intra-class
second-order moments. In the training process, subimage feature vectors containing the above
features were calculated for a training set. As each subimage was processed, its feature
vector was calculated, its nearest neighbor in feature space was located, the new centroid
of the region was calculated, and the moments of the vectors in that class were calculated.
Class region boundaries were recalculated when the second moments for the region started to
diverge. A binary decision tree was selected for classification, using the
Kolmogorov-Smirnov test [18] to determine the threshold value for each feature at each node
in the decision tree. This results in the selection, at each node, of the feature which has
maximum separation from the nearest neighbor in feature space. Figure 4 shows a
representation of a binary decision tree based on the features identified in the previous
section.
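
One common way to use the Kolmogorov-Smirnov statistic for choosing a node threshold is to
take the feature value at which the empirical distribution functions of two classes differ
most. The sketch below follows that reading; it is an assumption about how the test is
applied, not a reproduction of the authors' procedure, and the names `ks_threshold` and
`best_feature` are hypothetical.

```python
def ks_threshold(values_a, values_b):
    """Return (threshold, distance) maximizing |F_a(t) - F_b(t)| over observed values."""
    best_t, best_d = None, -1.0
    for t in sorted(set(values_a) | set(values_b)):
        # Empirical cumulative distribution of each class evaluated at t.
        f_a = sum(v <= t for v in values_a) / len(values_a)
        f_b = sum(v <= t for v in values_b) / len(values_b)
        d = abs(f_a - f_b)
        if d > best_d:
            best_t, best_d = t, d
    return best_t, best_d


def best_feature(class_a, class_b):
    """Pick the feature whose best threshold separates the two classes most.

    class_a and class_b are lists of feature dictionaries with identical keys.
    Returns (feature_name, threshold, distance).
    """
    best = None
    for name in sorted(class_a[0]):
        t, d = ks_threshold([v[name] for v in class_a], [v[name] for v in class_b])
        if best is None or d > best[2]:
            best = (name, t, d)
    return best
```

A node of the tree would then test `feature <= threshold` and pass the subimage to the left
or right branch, with the same selection repeated on the classes remaining at each child.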

The classifier output is a class identifier for the subimage processed. This is then used 
to create a map of the subimages within the image. The map is used as header information, 
prepended to the compressed image file, and used by a decompression process to recover the 
original image. 
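
A possible encoding of such a map is sketched below. The layout is purely an assumption made
for illustration (a 2-byte region count followed by fixed-width class identifiers packed
most significant first); the paper does not describe the header format, and the function
names are hypothetical.

```python
def pack_class_map(class_ids, bits_per_id=2):
    """Pack one class identifier per subimage into a compact header."""
    limit = 1 << bits_per_id
    value = 0
    for cid in class_ids:
        if not 0 <= cid < limit:
            raise ValueError(f"class id {cid} does not fit in {bits_per_id} bits")
        value = (value << bits_per_id) | cid
    n_bytes = (len(class_ids) * bits_per_id + 7) // 8
    # Assumed layout: 2-byte big-endian count, then the packed identifiers.
    return len(class_ids).to_bytes(2, "big") + value.to_bytes(n_bytes, "big")


def unpack_class_map(header, bits_per_id=2):
    """Recover the list of class identifiers written by pack_class_map."""
    count = int.from_bytes(header[:2], "big")
    n_bytes = (count * bits_per_id + 7) // 8
    value = int.from_bytes(header[2:2 + n_bytes], "big")
    mask = (1 << bits_per_id) - 1
    ids = [(value >> (bits_per_id * i)) & mask for i in range(count)]
    ids.reverse()  # identifiers were shifted in first-to-last
    return ids
```

With two bits per identifier, enough for four classes, the map for a 512x512 image divided
into 64x64 subimages (64 regions) occupies 18 bytes under this layout, a negligible overhead
relative to the image data.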







Figure 3: Sea-ice images showing light and dark evenly textured regions, and a region with 
high edge content 

4 Compression Results 

We selected several compression methods based on their ability to compress medical images.
These were: Lempel-Ziv dictionary coding with a 15-bit code [19] (LZ); Huffman coding with
adaptive modeling (HC); arithmetic coding with static modeling (Ar); arithmetic coding with
zero-order modeling (Ar-0); arithmetic coding with adaptive first-order modeling (Ar-1); and
arithmetic coding with adaptive second-order modeling (Ar-2).

We ran sixteen 512x512x16-bit sea-ice images through each of these processes to obtain a
baseline of performance for each compression method. Table 1 shows the results of this
baseline for three types of texture: smooth regions with very little or gradual change in
texture; moderately textured regions similar to the middle image in Figure 3; and regions
including a high density of gray-scale variations. Next, the images were divided into
64x64x16-bit subimages and the compression methods were run on each subimage. The
compression results were examined to determine the best compression method for each of the
subimages. The best compression technique was always one of LZ, Ar-0, Ar-1, or Ar-2 with a
bi-linear interpolation decorrelation preprocessing step. The number of classes (k) to be
considered was therefore taken as four. The feature set mentioned above was extracted for
each image, and cluster analysis was performed on the subimages from the entire image set.
An automatic classifier was built using the previously mentioned procedure and a binary
decision tree was generated. The sixteen 512x512 images were then run through the adaptive
process shown in Figure 1, and the compression values shown in Table 2 were obtained.
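
To illustrate why the best model order differs from one subimage to another, the sketch below
estimates the ideal code length of a pixel sequence under adaptive models of order 0, 1, and
2. It is a stand-in for the arithmetic coders tested, not a reproduction of them: the
add-one probability estimate, the use of the immediately preceding pixels in scan order as
context, and the function names are all assumptions.

```python
import math
from collections import defaultdict


def adaptive_code_length(pixels, order, alphabet_size=65536):
    """Ideal code length in bits under an adaptive model of the given order."""
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen
    bits = 0.0
    for i, sym in enumerate(pixels):
        ctx = tuple(pixels[max(0, i - order):i])
        # Add-one estimate, so unseen symbols keep a non-zero probability.
        p = (counts[ctx][sym] + 1) / (totals[ctx] + alphabet_size)
        bits -= math.log2(p)
        # The model is updated only after the symbol has been coded.
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


def best_order(pixels, orders=(0, 1, 2), alphabet_size=65536):
    """Return (order with the shortest estimated code, all estimated costs)."""
    costs = {k: adaptive_code_length(pixels, k, alphabet_size) for k in orders}
    return min(costs, key=costs.get), costs
```

For a regular pattern, a higher order quickly learns the repetition and the estimated cost
falls, whereas for a rough subimage the statistics are spread over many contexts and a lower
order tends to win, consistent with the behaviour reported for smooth and textured regions.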


5 Conclusions 

The adaptive application of a variety of compression techniques to satellite images, as
opposed to applying one technique, appears to give an improvement in lossless compression on
the order of 15%. The approach we have presented to select the lossless compression
technique is to calculate features of subimages within the image and use the feature values
in a binary decision tree to select the best compression technique for that subimage. While
these results by no means imply that this approach will universally provide better
compression for all satellite images, they do indicate that further work in expanding the
test database is merited. Expansion of the compression approaches to include a decorrelation
preprocessing step, as described previously, should be undertaken. The feature set used to
select the compression method was adequate, but attention needs to be paid to using the most
cost-effective features from a processing point of view. It is believed that this
compression approach, using low-cost multiprocessing architectures and large secondary
caches on each processor, will provide compression performance suitable for use in satellite
imagery systems such as TRMM and EOS.

Figure 4: Binary decision tree to select the optimal compression method based on selected
feature values

Compression    Smooth    Moderate    Granular
Method         Image     Image       Image

Huffman          46         34          26
Ar               46         35          27
LZ               54         46          35
Ar-0             51         47          44
Ar-1             68         55          49
Ar-2             68         51          43

Table 1: The averaged values for the compression methods tested, with compression expressed
as a percentage of source/compressed file size

Smooth    Moderate    Granular
Image     Image       Image

  71         62          56

Table 2: Adaptive compression applied to the same images, showing better compression than
the best of Table 1
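
As a rough check on the size of the improvement, the relative gain of the adaptive scheme over
the best single method can be computed from the values in Tables 1 and 2. The snippet below
is only illustrative arithmetic; treating the gain as a simple ratio of the tabulated
percentages is an assumption about how the figures should be compared.

```python
# Best single-method value per texture class (Table 1) and adaptive value (Table 2).
BEST_SINGLE = {"smooth": 68, "moderate": 55, "granular": 49}
ADAPTIVE = {"smooth": 71, "moderate": 62, "granular": 56}


def relative_gain_percent(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


gains = {
    name: round(relative_gain_percent(BEST_SINGLE[name], ADAPTIVE[name]), 1)
    for name in BEST_SINGLE
}
print(gains)  # {'smooth': 4.4, 'moderate': 12.7, 'granular': 14.3}
```

On this reading the gain is largest for the moderate and granular images and smallest for
the smooth images, where the single best method already performs well.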